A SPARQL Extension for Generating RDF from Heterogeneous Formats
Abstract
RDF aims to be the universal abstract data model for structured data on the Web. While there are efforts to convert data to RDF, the vast majority of data available on the Web does not conform to RDF. Indeed, exposing data in RDF, either natively or through wrappers, can be very costly. Furthermore, in the emerging Web of Things, the resource constraints of devices prevent them from processing RDF graphs. Hence one cannot expect all the data on the Web to be available as RDF anytime soon. Several tools can generate RDF from non-RDF data, and transformation or mapping languages have been designed to offer more flexible solutions (GRDDL, XSPARQL, R2RML, RML, CSVW, etc.). In this paper, we introduce a new language, SPARQL-Generate, that generates RDF from: (i) an RDF dataset, and (ii) a set of documents in arbitrary formats. As SPARQL-Generate is designed as an extension of SPARQL 1.1, it can provably: (i) be implemented on top of any existing SPARQL engine, and (ii) leverage the SPARQL extension mechanism to deal with an open set of formats. Furthermore, we show evidence that (iii) it can be easily learned by knowledge engineers who know SPARQL 1.1, and (iv) our first naive open-source implementation performs better than the reference implementation of RML for big transformations.
Similar resources
Extensions of SPARQL towards Heterogeneous Sources and Domain Annotations
SPARQL is the W3C Recommended query language for RDF. My current work aims at extending SPARQL in two distinct ways: (i) to allow a better integration of RDF and XML; and (ii) to define a query language for RDF extended with domain specific annotations. Transforming data between XML and RDF is a much required, but not so simple, task in the Semantic Web. The aim of (i) is to enable transparent ...
Flexible RDF Generation from RDF and Heterogeneous Data Sources with SPARQL-Generate
RDF aims to be the universal abstract data model for structured data on the Web. While there are efforts to convert data to RDF, the vast majority of data available on the Web does not conform to RDF. Indeed, exposing data in RDF, either natively or through wrappers, can be very costly. In this context, transformation or mapping languages that define generation of RDF from non-RDF data represen...
Update Semantics for Interoperability among XML, RDF and RDB - A Case Study of Semantic Presence in CISCO's Unified Presence Systems
XSPARQL is a transformation and querying language that provides an integrated access over heterogeneous data sources on the fly. It is an extension of XQuery which supports a subset of SPARQL and SQL to provide unified access over XML, RDF and RDB formats. In practical applications, data integration does not only require the integrated access over distributed heterogeneous data sources, but als...
Génération de RDF à partir de sources de données aux formats hétérogènes
Abstract. Contrary to what the Web of Data promotes, the data exposed by most organizations is in non-RDF formats such as CSV, JSON, or XML. Moreover, on the Web of Things, constrained devices will prefer binary formats such as EXI or CBOR to textual RDF formats. In this context, RDF can nevertheless serve as a lingua franca for semantic interoperability...
On the Semantics of Heterogeneous Querying of Relational, XML and RDF Data with XSPARQL
XSPARQL is a transformation and query language that caters for heterogeneous sources: in its current state it can transform data between XML and RDF formats thanks to the integration of the XQuery and SPARQL query languages. In this paper we propose an extension of the XSPARQL language to incorporate data contained in relational databases by integrating a subset of SQL in the syntax of ...